Skip to content

Fix cache hash for user-defined record types#7184

Closed
jorgee wants to merge 1 commit into
masterfrom
fix-typed-record-hashing
Closed

Fix cache hash for user-defined record types#7184
jorgee wants to merge 1 commit into
masterfrom
fix-typed-record-hashing

Conversation

@jorgee

@jorgee jorgee commented May 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Typed records — those declared via record … { … } and instantiated through their generated constructor — fell through the type dispatch in HashBuilder to the catch-all that uses value.hashCode(). Because the generated record classes don't override hashCode, every instance produced a different identity-based hash, and -resume always re-ran the task even when inputs were equal.

The user-visible symptom in -dump-hashes is a record value rendered as _nf_script_xxx.Dataset@20d692ba (the JVM default Object.toString()), with a different @<hex> suffix on every run.

This PR adds a Record branch to HashBuilder.with(Object) that reflectively reads the record's public, non-static instance fields into a LinkedHashMap<String,Object> and hashes it through the same hashUnorderedCollection path used for Map/RecordMap. Net effects:

  • A typed record whose fields haven't changed hashes the same across runs → -resume works.
  • A record built with the record(...) stdlib helper and a typed-record constructor with the same field values produce the same hash.
  • The hash is independent of named-argument / field-declaration order (sum of per-entry hashes is commutative).
  • The new branch is placed after the Map branch so RecordMap (which is both Record and Map) keeps its existing hashing path.

Test plan

New unit tests in HashBuilderTest:

  • should hash typed record by field values, independent of JVM identity — regression test for the -resume bug.
  • typed record and equivalent RecordMap should hash to the same value
  • record built with the record(...) built-in should hash the same as the typed constructor
  • record(...) built-in hash should not depend on the order of named arguments
  • nested typed records should hash by value — recursion through the new branch.
  • nested typed record and nested RecordMap should hash equally

./gradlew :nf-commons:test --tests "nextflow.util.HashBuilderTest" — 17/17 pass.
CacheHelperTest, RecordMapTest, TypeHelperTest — unaffected.

🤖 Generated with Claude Code

Typed records (declared via `record … { … }`) instantiated by their
generated constructor fell through the type dispatch in `HashBuilder`
to the catch-all that uses `value.hashCode()`. Because the generated
record classes don't override `hashCode`, every instance produced a
different identity-based hash and `-resume` always re-ran the task.

This adds a `Record` branch that reflectively extracts a record's
public, non-static fields into a map and hashes it through the same
unordered-collection path used by `Map`/`RecordMap`. As a result,
records built via the `record(...)` built-in and records instantiated
from a typed declaration produce the same hash when their field values
are equal, regardless of insertion order or JVM identity.

Signed-off-by: Jorge Ejarque <jorge.ejarque@seqera.io>
Signed-off-by: jorgee <jorge.ejarque@seqera.io>
@netlify

netlify Bot commented May 28, 2026

Copy link
Copy Markdown

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit b500406
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/6a17fd19bf66bf0008130c4c

@ewels ewels requested a review from bentsherman May 28, 2026 08:37
Comment on lines +159 to +160
else if( value instanceof Record )
hashUnorderedCollection(hasher, recordToMap((Record) value).entrySet(), mode);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every record instance is a RecordMap, so it should be caught by the Map case above

Comment on lines +178 to +181
def r1 = new SampleRecord('alpha', 1)
def r2 = new SampleRecord('alpha', 1)
def r3 = new SampleRecord('beta', 1)
def r4 = new SampleRecord('alpha', 2)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Records cannot be created via constructor. But it may be that this happens to work and is only disallowed by the type checker

@bentsherman

Copy link
Copy Markdown
Member

Closing in favor of #7185

@bentsherman bentsherman deleted the fix-typed-record-hashing branch May 28, 2026 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants